27 research outputs found

    Transfer learning and sentence level features for named entity recognition on tweets

    Get PDF
    We present our system for the WNUT 2017 Named Entity Recognition challenge on Twitter data. We describe two modifications of a basic neural network architecture for sequence tagging. First, we show how we exploit additional labeled data, where the Named Entity tags differ from the target task. Then, we propose a way to incorporate sentence level features. Our system uses both methods and ranked second for entity level annotations, achieving an F1-score of 40.78, and second for surface form annotations, achieving an F1-score of 39.33

    Improving a semantic parser through user interaction

    Get PDF

    TwistBytes - identification of Cuneiform languages and German dialects at VarDial 2019

    Get PDF
    We describe our approaches for the German Dialect Identification (GDI) and the Cuneiform Language Identification (CLI) tasks at the VarDial Evaluation Campaign 2019. The goal was to identify dialects of Swiss German in GDI and Sumerian and Akkadian in CLI. In GDI, the system should distinguish four dialects from the German-speaking part of Switzerland. Our system for GDI achieved third place out of 6 teams, with a macro averaged F-1 of 74.6%. In CLI, the system should distinguish seven languages written in cuneiform script. Our system achieved third place out of 8 teams, with a macro averaged F-1 of 74.7%

    Swiss chocolate at CAp 2017 NER challenge : partially annotated data and transfer learning

    Get PDF

    Correction of Errors in Preference Ratings from Automated Metrics for Text Generation

    Full text link
    A major challenge in the field of Text Generation is evaluation: Human evaluations are cost-intensive, and automated metrics often display considerable disagreement with human judgments. In this paper, we propose a statistical model of Text Generation evaluation that accounts for the error-proneness of automated metrics when used to generate preference rankings between system outputs. We show that existing automated metrics are generally over-confident in assigning significant differences between systems in this setting. However, our model enables an efficient combination of human and automated ratings to remedy the error-proneness of the automated metrics. We show that using this combination, we only require about 50% of the human annotations typically used in evaluations to arrive at robust and statistically significant results while yielding the same evaluation outcome as the pure human evaluation in 95% of cases. We showcase the benefits of approach for three text generation tasks: dialogue systems, machine translation, and text summarization

    LEDGAR : a large-scale multi-label corpus for text classification of legal provisions in contracts

    Get PDF
    We present LEDGAR, a multilabel corpus of legal provisions in contracts. The corpus was crawled and scraped from the public domain (SEC filings) and is, to the best of our knowledge, the first freely available corpus of its kind. Since the corpus was constructed semi-automatically, we apply and discuss various approaches to noise removal. Due to the rather large labelset of over 12'000 labels annotated in almost 100'000 provisions in over 60'000 contracts, we believe the corpus to be of interest for research in the field of Legal NLP, (large-scale or extreme) text classification, as well as for legal studies. We discuss several methods to sample subcopora from the corpus and implement and evaluate different automatic classification approaches. Finally, we perform transfer experiments to evaluate how well the classifiers perform on contracts stemming from outside the corpus

    Overview of the GermEval 2020 shared task on Swiss German language identification

    Get PDF
    In this paper, we present the findings of the Shared Task on Swiss German Language Identification organised as part of the 7th edition of GermEval, co-locatedwith SwissText and KONVENS 2020

    ZHAW-CAI at CheckThat! 2023 : ensembling using kernel averaging

    Get PDF
    We describe our approaches to sub-task 1A on multi-modal check-worthiness classification of the CheckThat! Lab 2023 in English. The goal was to determine whether a tweet is worth fact-checking based on its text and image content. Our submission was based on a kernel ensemble of different uni-modal and multi-modal classifiers. It achieved second place out of 7 teams with an F1 score of 0.708

    Correction of errors in preference ratings from automated metrics for text generation

    Get PDF
    A major challenge in the field of Text Generation is evaluation: Human evaluations are cost-intensive, and automated metrics often display considerable disagreements with human judgments. In this paper, we propose to apply automated metrics for Text Generation in a preference-based evaluation protocol. The protocol features a statistical model that incorporates various levels of uncertainty to account for the error-proneness of the metrics. We show that existing metrics are generally over-confident in assigning significant differences between systems. As a remedy, the model allows to combine human ratings with automated ratings. We show that it can reduce the required amounts of human ratings to arrive at robust and statistically significant results by more than 50%, while yielding the same evaluation outcome as the pure human evaluation in 95% of cases. We showcase the benefits of the evaluation protocol for three text generation tasks: dialogue systems, machine translation, and text summarization

    TRANSLIT : a large-scale name transliteration resource

    Get PDF
    Transliteration is the process of expressing a proper name from a source language in the characters of a target language (e.g. from Cyrillic to Latin characters). We present TRANSLIT, a large-scale corpus with approx. 1.6 million entries in more than 180 languages with about 3 million variations of person and geolocation names. The corpus is based on various public data sources, which have been transformed into a unified format to simplify their usage, plus a newly compiled dataset from Wikipedia. In addition, we apply several machine learning methods to establish baselines for automatically detecting transliterated names in various languages. Our best systems achieve an accuracy of 92\% on identification of transliterated pairs
    corecore